Abstract

We consider quantum computations comprising only commuting gates, known as IQP
computations, and provide compelling evidence that the task of sampling their output
probability distributions is unlikely to be achievable by any efficient classical means. More
specifically we introduce the class post-IQP of languages decided with bounded error by uniform
families of IQP circuits with post-selection, and prove first that post-IQP equals the classical
class PP. Using this result we show that if the output distributions of uniform IQP circuit
families could be classically efficiently sampled, even up to 41% multiplicative error in the
probabilities, then the infinite tower of classical complexity classes known as the polynomial
hierarchy would collapse to its third level. We mention some further results on the classical
simulation properties of IQP circuit families, in particular showing that if the output
distribution results from measurements on only O(log n) lines then it may, in fact, be classically
efficiently sampled.

1 Introduction

From a pragmatic point of view the field of quantum computing is driven by the expectation 
that quantum algorithms can offer some computational complexity benefits transcending the 
possibilities of classical computing. But this expectation can be challenged both theoretically and 
experimentally: (a) there is yet no theoretical proof that any quantum algorithm outperforms 
the best classical algorithm for the task, in the standard computational setting of polynomial

vs. exponential running time (without inclusion of further constructs, such as use of oracles, or 
consideration of distributed computing and the role of communication; in both these scenarios 
there are indeed proofs of exponential complexity benefits); (b) experimentally there are well 
documented difficulties associated with building a quantum computer that is suitably fault 
tolerant and sufficiently scalable to manifestly demonstrate a complexity benefit. 

However both (a) and (b) can, to some extent, be redressed by further examination: the 
criticism in (a) can be attributed to limitations of classical complexity theory - we do have
interesting quantum algorithms (such as Shor's factoring algorithm) for problems widely believed 
to be classically hard but there is no proof of the latter. Proof of classical hardness is a notoriously 
difficult issue (cf the famous P vs. NP question) and it has become popular to resort to providing 
only evidence of hardness, as follows: we prove that if a certain problem were classically easy then 
this would entail consequences that are highly implausible (although also generally unproven)
e.g. collapse of an entire complexity class (such as entailing that P = NP). For (b) we could
seek to devise a computational task that, on the one hand is expected to be classically hard (as 
above) yet on the other hand, can be implemented using suitably simple (sub-universal) quantum 
computational elements that are especially easily or fault-tolerantly implementable within some 
specific experimental scheme. In this paper we develop a family of such computational tasks (that 
amount to sampling from suitably prescribed probability distributions). Recently a different 
approach to similar issues has been described by Aaronson in [20]. More generally there has 
been increasing interest in physically restricted forms of quantum computing and a study of 
associated complexity classes [1, 2, 4, 11, 12]. 

We consider so-called temporally unstructured quantum computations (also known as IQP or 
"instantaneous" quantum computation) introduced in [3, 4]. Our main result is to demonstrate 
that if quantum circuits comprising 2-qubit commuting gates could be simulated classically (even 
up to a generous multiplicative error tolerance as described below) then the infinite tower of 
complexity classes known as the polynomial hierarchy (PH) would collapse to its third level.
While not implying that P=NP, such a collapse is nevertheless widely regarded as being similarly 
implausible. Apart from their tantalising theoretical simplicity, such circuits of only commuting 
gates are known to be of significance for super- and semi-conductor qubit implementations, 
where it has recently been shown [5] that they are much simpler to implement fault-tolerantly 
than gates drawn from a fully universal set. 

A significant ingredient in our derivations will be the notion of a post-selected quantum 
computation. Aaronson [6] has shown that if post-selection is included with universal polynomial 
time quantum computation then the computational power is boosted from BQP to the classical 
class PP. We will show that, somewhat surprisingly, post-selection boosts the power of the much 
weaker class of polynomial time IQP computations to PP too. 

The notion of classical simulation that applies in our main result is an especially weak one - 
broadly speaking (cf precise definitions below) given a description of a quantum circuit we ask 
for a classical process that can provide a sample of a probability distribution that approximates
the output distribution of the quantum process to a suitable multiplicative accuracy. A very 
much stronger notion of simulation sometimes used in the literature (which we shall call a strong 
simulation) is to ask for a classical efficient computation of the value of any marginal or total 
output probability, to exponential precision. Previously it was known [13, 14] that the existence 
of such strong simulations for some classes of quantum computations would imply the collapse 
of the polynomial hierarchy. Our result contrasts with these works in the striking simplicity 
of the quantum processes being considered and in the very much weaker requirements in the 
classical simulation. 



2 Preliminary notions 

We begin by introducing some definitions and notations needed to give precise statements of 
our main results. 

2.1 Computational tasks 

Conventionally a computational task T is a specified relationship between inputs $w = x_1 \ldots x_n$
and outputs $y_1 \ldots y_m = T(w)$ which are taken to be bit strings. The length n of the input string
is called the input size. A computational process C with (generally probabilistic) output $C(w)$
on input w is said to compute T with bounded error if there is a constant $0 < \epsilon < \frac{1}{2}$ such that
for all inputs, $\mathrm{prob}[C(w) = T(w)] > 1 - \epsilon$. C computes T with unbounded error if for all inputs
we have $\mathrm{prob}[C(w) = T(w)] > \frac{1}{2}$. If the output of T is always a single bit then T is called a
decision task associated to the subset $\{w : T(w) = 1\}$ of all bit strings. A subset of bit strings
is called a language. 

A more general kind of computational task involves merely the sampling of a probability 
distribution on m-bit strings whose result is not necessarily associated to any desired "correct"
outcome $y_1 \ldots y_m$ as above. For example, for each n-bit string w we may have an associated
quantum circuit $C_w$ with output probability distribution $P_w$ on m-bit strings, and we may be
interested to know how hard it is to sample from $P_w$ by purely classical means, given a description
of the circuit $C_w$.

2.2 Uniform families of circuits 

We shall use a notion of uniform circuit family that is slightly different from the standard 
textbook definition, motivated by a desire to make more transparent the contribution of the 
uniformity condition to the final overall computational process. 

In the Turing machine model of computation a single machine suffices to deal with inputs 
of all sizes. In contrast in the circuit model, any single circuit has a fixed number of input 
lines so to treat inputs of all sizes it is conventional to introduce the notion of a circuit family 
$\{C_n\} = \{C_1, C_2, \ldots\}$ with $C_n$ being a circuit intended to perform the computation for all inputs
of size n. In this formalism we need to impose an auxiliary uniformity condition specifying
computational limitations (cf below) on how the (descriptions of the) circuits $C_n$ themselves are
generated as a function of n. In the absence of any such condition, hard (or even uncomputable)
computational results may, by fiat, be hard wired into the varying structure of the circuits $C_n$
with n. In standard treatments a circuit family $\{C_n\}$ is parameterised by input size n (with
$C_n$ being a circuit processing all inputs of size n). For our purposes it will be more convenient
to parameterise the circuit family by the inputs $w = x_1 \ldots x_n$ themselves, with circuits always
acting on a standard input such as $0 \ldots 0$ (or $|0\rangle \ldots |0\rangle$ for quantum circuits), resulting in circuit
families denoted $\{C_w\}$. Thus for example in comparison with the standard definition, we could
take the circuit $C_w$ to be the circuit $C_n$, prefixed by some NOT gates (depending on w) that
initially convert the input $0 \ldots 0$ into w. Our formal definition is as follows.

Definition 1 A uniform family of circuits (of some specified type) is a mapping $w \mapsto C_w$ where
$w = x_1 \ldots x_n$ is a bit string of length n, $C_w$ is a (classical) description of a circuit (of the
appropriate type) and the mapping $w \mapsto C_w$ is computable in classical poly(n) time. Here the
description $C_w$ includes (i) a specification of a sequence of gates and lines upon which they act,
(ii) a specification of the inputs for all lines (often taken to be $0 \ldots 0$ resp. $|0\rangle \ldots |0\rangle$ for classical
resp. quantum circuits), (iii) a specification of which lines comprise the output register, and (iv)
a specification of any other registers needed for a circuit of the type being used (e.g. a register of
lines initialised to random bit values for randomised computation, or a register of post-selection
lines for post-selected computations, as defined later).

Associated to any uniform circuit family we have a family of probability distributions $\{P_w\}$
(on m-bit strings where m is the size of the output register of $C_w$), defined by the output of the
computational process described by $C_w$.



Since $w \mapsto C_w$ is computable in poly(n) time, each circuit $C_w$ has poly(n) size and acts on
at most poly(n) lines. One may entertain other uniformity conditions e.g. having $w \mapsto C_w$
computable in classical log space (as is generally adopted for $n \mapsto C_n$ in the textbook definition
of uniform families). For us the poly(n) time uniformity condition is adequate, as we are 
primarily interested in circuits whose computational power is potentially stronger than, or not 
commensurate with, classical deterministic polynomial time. Our uniformity definition (based 
on inputs w rather than just input sizes n) then transparently simply prefixes the processing 
power of the circuits with arbitrary classical deterministic polynomial time computation. 

For classical deterministic polynomial time computation in our circuit family definition, the 
computation can be totally represented within the uniformity stage $w \mapsto C_w$ and the $C_w$'s can be
taken to be trivial circuits that perform no further computation beyond outputting the obtained
answer. Classical randomised polynomial time computation is modelled by circuits $C_w$ that have
a designated register of lines (disjoint from the input register) which is initialised with random
bits for each run of the computation $C_w$. Such circuits are called classical randomised circuits.
The complexity class of decision tasks decided with bounded error (resp. unbounded error) by
uniform families of classical randomised circuits is denoted BPP (resp. PP). It is well known
that BPP is independent of the value of the constant error tolerance $\epsilon$. For universal polynomial
time quantum computation the circuits $C_w$ comprise quantum gates, each acting on a constant
number of lines. The input is taken to be the standard state $|0\rangle \ldots |0\rangle$ and the output is the
(probabilistic) result of a computational basis measurement on a designated register of output 
lines. The class of decision tasks solved with bounded error by such uniform families is denoted 
BQP. (This definition is easily seen to be equivalent to other standard definitions of BQP such 
as in [17]). 

2.3 IQP circuits 

We now come to our notion of quantum computations comprising commuting gates. In [4] 
these have been called IQP ("instantaneous quantum polynomial time") computations since in 
quantum physics, such gates may be applied simultaneously. 

Definition 2 An IQP circuit on n qubit lines is a quantum circuit with the following structure: 
each gate in the circuit is diagonal in the X basis $\{|0\rangle \pm |1\rangle\}$, the input state is $|0\rangle |0\rangle \ldots |0\rangle$ and
the output is the result of a computational basis measurement on a specified set of output lines. 

In this paper we will assume that each gate in the description of an IQP circuit $C_w$ is specified
by giving its diagonal entries and the lines on which it acts. Thus a poly(n) sized description
implies that any gate acts on at most O(log n) lines. We note however that other inequivalent
conventions are possible e.g. in [4] gates are specified by a parameter $\theta$ and a subset $i_1, \ldots, i_k$ of
lines, corresponding to the gate $U = \exp(i\theta X_{i_1} \otimes \ldots \otimes X_{i_k})$ which may thus act on O(n) lines,
although its (potentially exponentially many) diagonal entry phases $\pm\theta$ are all equal up to sign.

It will sometimes be convenient to represent an IQP circuit in terms of gates diagonal in 
the Z (or computational) basis. In this representation the inputs and outputs are the same as 
before but the circuit of gates is required to have the following structure: each qubit line begins 
and ends with a Hadamard (H) gate, and in between, every gate is diagonal in the Z basis. 
This is easily seen to be equivalent to the previous definition (by inserting two H's on each line 
between each pair of gates, recalling that HH = I, and then absorbing all H's into conjugation
actions on the Z basis diagonal gates, leaving only X basis diagonal gates). 
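
The equivalence of the two representations can be checked numerically on a small case. The following is a minimal sketch in plain Python for two qubits; the gate phases and all helper names (`matmul`, `kron`, `diag`, `conjugate_by_h`, `close`) are illustrative assumptions and not part of the original construction.

```python
import cmath
import math

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]  # single-qubit Hadamard gate


def matmul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def kron(A, B):
    """Kronecker (tensor) product of two matrices."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def diag(phases):
    """Z-basis diagonal gate with diagonal entries exp(i * phase)."""
    n = len(phases)
    return [[cmath.exp(1j * phases[i]) if i == j else 0 for j in range(n)]
            for i in range(n)]


def conjugate_by_h(D, HH):
    """Return (H x H) D (H x H); for Z-diagonal D this is an X-diagonal gate."""
    return matmul(HH, matmul(D, HH))


def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))


HH = kron(H, H)
# Two Z-diagonal gates on 2 qubits (phases chosen arbitrarily for illustration).
D1 = diag([0.0, 0.3, 0.7, 1.1])
D2 = diag([0.0, math.pi / 8, -math.pi / 8, math.pi])

# Z-basis form: Hadamard layer, diagonal gates, Hadamard layer.
circuit_z = matmul(HH, matmul(D2, matmul(D1, HH)))

# X-basis form: insert HH = I between the gates and absorb the H's by conjugation.
X1, X2 = conjugate_by_h(D1, HH), conjugate_by_h(D2, HH)
circuit_x = matmul(X2, X1)

print(close(circuit_z, circuit_x))               # the two forms agree
print(close(matmul(X1, X2), matmul(X2, X1)))     # the X-diagonal gates commute
```

Both checks compare full 4 x 4 unitaries, so they confirm equality of the circuits themselves and not only of their output distributions.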



As noted in definition 1 any uniform circuit family $\{C_w\}$ associates a probability distribution
$P_w$ to each bit string w and we will be especially interested to consider whether this distribution
can be sampled (to suitable accuracy) by purely classical means in poly(n) time, given the
classical description of the circuit $C_w$. For this issue it will be significant to note the number of
output lines, and especially its growth with n. 

2.4 Post-selected circuits 

An important theoretical tool in our arguments will be the notion of a post-selected (classical 
or quantum) circuit C. This is a circuit which, in addition to a specified register of output 
lines $\mathcal{O}$, has a further (disjoint) specified register of post-selection lines $\mathcal{P}$. Then instead of
sampling measurement results x directly from the output lines with distribution $\mathrm{prob}[\mathcal{O} = x]$,
we consider only those runs of the process for which a measurement on the post-selection lines
yields $00 \ldots 0$ i.e. the output distribution on $\mathcal{O}$ is now taken to be the conditional distribution
$\mathrm{prob}[\mathcal{O} = x \,|\, \mathcal{P} = 00 \ldots 0]$. In this construction we also require the circuit C to have the property
that $\mathrm{prob}[\mathcal{P} = 00 \ldots 0] \neq 0$ so that the conditional probabilities are well defined:

\[ \mathrm{prob}[\mathcal{O} = x \,|\, \mathcal{P} = 00 \ldots 0] = \frac{\mathrm{prob}[\mathcal{O} = x \ \&\ \mathcal{P} = 00 \ldots 0]}{\mathrm{prob}[\mathcal{P} = 00 \ldots 0]}. \tag{1} \]

In practical terms a post-selected computation would be implemented by repeatedly running the 
computation and considering the output register only if the post-selection register is measured 
to yield 00 ... 0. Since we place no limit on how small the (non-zero) probability of the latter 
event may be, the post-selection process may incur an exponential overhead in time, and similar 
to the notion of a non-deterministic computation, it is principally of interest as a theoretical 
tool rather than as a feasible computational resource. 
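
As a concrete illustration of eq. (1), the following sketch computes the post-selected distribution from an explicitly given joint distribution over all lines. The function name `postselect`, the dictionary representation and the toy numbers are assumptions made only for this example; a real circuit would of course not hand us the joint distribution as a table.

```python
from collections import defaultdict


def postselect(joint, out_idx, post_idx):
    """Conditional distribution on the output lines given all post-selection lines read 0.

    joint    -- dict mapping bit strings (one character per line) to probabilities
    out_idx  -- positions of the output register lines
    post_idx -- positions of the post-selection register lines
    """
    numerator = defaultdict(float)
    denominator = 0.0
    for bits, p in joint.items():
        if all(bits[i] == "0" for i in post_idx):
            denominator += p
            numerator["".join(bits[i] for i in out_idx)] += p
    if denominator == 0.0:
        # The definition requires prob[P = 00...0] to be non-zero.
        raise ValueError("post-selection event has probability zero")
    return {x: p / denominator for x, p in numerator.items()}


# Toy joint distribution on two lines: line 0 is the output, line 1 is post-selected.
joint = {"00": 0.1, "10": 0.3, "01": 0.4, "11": 0.2}
conditional = postselect(joint, out_idx=[0], post_idx=[1])
print(conditional)
```

Here only the runs "00" and "10" survive post-selection, so the output bit is 1 with conditional probability 0.3 / 0.4.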

Definition 3 A language L is in the class post-IQP (resp. post-BQP or post-BPP) iff there
is an error tolerance $0 < \epsilon < \frac{1}{2}$ and a uniform family $\{C_w\}$ of post-selected IQP (resp. quantum
or randomised classical) circuits with a specified single line output register $\mathcal{O}_w$ (for the
L-membership decision problem) and a specified (generally O(poly(n))-line) post-selection register
$\mathcal{P}_w$ such that:

(i) if $w \in L$ then $\mathrm{prob}[\mathcal{O}_w = 1 \,|\, \mathcal{P}_w = 00 \ldots 0] > 1 - \epsilon$ and
(ii) if $w \notin L$ then $\mathrm{prob}[\mathcal{O}_w = 0 \,|\, \mathcal{P}_w = 00 \ldots 0] > 1 - \epsilon$.

It is pertinent to remark on the $\epsilon$-independence of the classes in definition 3 above. The basic
bounded error classes BPP and BQP are well known to be independent of the error tolerance
$0 < \epsilon < \frac{1}{2}$. Indeed the standard method [7, 8] for reducing $\epsilon$ is to consider the majority vote
answer of multiple runs of the circuit. Similarly post-BPP and post-BQP are easily seen to be
independent of the error tolerance value too. The class post-IQP is in fact also independent
of $\epsilon$, as will follow from theorem 1 below. However the class BIQP$_\epsilon$ of languages decided with
bounded error $\epsilon$ by uniform families of IQP circuits (with no post-selection) is not known to be
independent of $\epsilon$ as it is not evident whether or not the majority vote function can be realised
by just (commuting) IQP circuits. Fortunately we will not need to directly use BIQP$_\epsilon$ in our
arguments. 

Post-selected classical computation has been considered in [15, 16]. The class called BPP$_{\mathrm{path}}$
that is extensively studied in [16] is easily seen to be equal to our class post-BPP. 



For quantum computation, the class post-BQP was introduced and studied by Aaronson in 
[6] where it was shown that post-BQP equals the classical class PP. Note that if general quantum
or classical circuits are available, it suffices (as in [6]) to use post-selection registers of only a
single line, since for any register of k lines we may adjoin a circuit that computes some simple
function f with $f(x_1 \ldots x_k) = 0$ iff $x_1 \ldots x_k = 00 \ldots 0$ e.g. the OR of the k bit values suffices.
However if the allowed gates are restricted (as in the case of IQP circuits) it may not be possible 
to compute any such function using only the allowed resources, and post-selection on multiple 
lines needs to be entertained, as in our definition above. 

2.5 Notions of classical simulation for quantum circuits 

There are various possible notions of classical simulation for quantum circuit families. For any 
uniform family $\{C_w\}$ let $P_w$ denote the output distribution of $C_w$ and let n denote the length
of w.

(a) We say that a circuit family is strongly simulable if any output probability in $P_w$ and any
marginal probability of $P_w$ can be computed to m digits of precision in classical poly(n, m) time.

(b) A circuit family is weakly simulable if given the description of $C_w$, its output distribution $P_w$
can be sampled by purely classical means in poly(n) time. Note that strong simulability implies
weak simulability [13] - although the sample space of $P_w$ is exponentially large in n we can sample
the distribution in poly(n) time by successively sampling the bits; the binary distribution used
for each successive bit is the conditional distribution, conditioned on the already seen values,
and these two conditional probabilities can be computed in poly(n) time via Bayes' rule, as a
quotient of two marginal probabilities of $P_w$.
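
The bit-by-bit sampling argument in (b) can be sketched as follows. We assume a hypothetical oracle `marginal(prefix)` returning the probability that the first bits of the output equal `prefix`; this stands in for the strong simulation. The helper `make_marginal` builds such an oracle from an explicit table purely for demonstration (it sums over the whole table, which a genuine strong simulation would avoid).

```python
import random


def sample_from_marginals(marginal, m, rng=random):
    """Sample an m-bit string using only marginal probabilities of prefixes.

    marginal(prefix) must return prob[first len(prefix) bits == prefix],
    with marginal("") == 1. Each bit is drawn from the conditional
    distribution given the bits already chosen (Bayes' rule).
    """
    prefix = ""
    for _ in range(m):
        # Quotient of two marginals: prob[next bit = 0 | bits so far].
        p_zero = marginal(prefix + "0") / marginal(prefix)
        prefix += "0" if rng.random() < p_zero else "1"
    return prefix


def make_marginal(dist):
    """Toy marginal oracle built from an explicit distribution (illustration only)."""
    return lambda prefix: sum(p for s, p in dist.items() if s.startswith(prefix))


toy = {"00": 0.5, "11": 0.5}  # perfectly correlated pair of bits
samples = [sample_from_marginals(make_marginal(toy), 2, random.Random(k)) for k in range(20)]
print(sorted(set(samples)))
```

The sampler makes 2m oracle calls per sample, so a poly(n) time strong simulation yields a poly(n) time weak simulation, and a prefix of probability zero is never reached because the corresponding branch is never chosen.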

Next we have some notions of approximate classical simulation. 

(c) A circuit family is weakly simulable with multiplicative error $c \geq 1$ if there is a family $R_w$
of distributions (on the same sample spaces as $P_w$) such that $R_w$ can be sampled in classical
poly(n) time and for all x and w we have

\[ \frac{1}{c}\, \mathrm{prob}[P_w = x] \leq \mathrm{prob}[R_w = x] \leq c\, \mathrm{prob}[P_w = x]. \tag{2} \]

(d) A circuit family is weakly simulable within $\epsilon$ total variation distance if there is a family $R_w$
as in (c) above, but with eq. (2) replaced by the condition

\[ \sum_x \left| \mathrm{prob}[P_w = x] - \mathrm{prob}[R_w = x] \right| < \epsilon. \]

(e) A further notion of approximate weak simulation has been formulated in [11]: recall first
that the Chernoff-Hoeffding bound (cf Appendix of [11]) implies the following result - if we
have a quantum process implementing $C_w$ then by running it poly-many times we can (with
probability exponentially close to 1) obtain an estimate $\tilde{p}$ of any output probability p to within
polynomial precision i.e. for any polynomial f(n) we can output $\tilde{p}$ such that $|\tilde{p} - p| < 1/f(n)$.
We say that a circuit family is (classically) weakly simulable with additive polynomial error if
the same estimates can be obtained from the circuit descriptions $C_w$ by purely classical means
in poly(n) time (and probability exponentially close to 1). Thus weak simulability implies weak
simulability with additive polynomial error.

Note that if a uniform circuit family $C_w$ decides a language L with bounded error probability
$0 < \epsilon < \frac{1}{2}$ then the existence of a weak (resp. strong) simulation for $C_w$ implies that $L \in$ BPP
(resp. P). Similarly the existence of a weak simulation with additive polynomial error, or with
multiplicative error $1 \leq c < 2(1 - \epsilon)$, will also imply that $L \in$ BPP. The latter condition on c
serves to guarantee that $R_w$ still decides L with a bounded error $0 < \epsilon' < \frac{1}{2}$.
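
The condition on c can be checked in one line. The following is a sketch for a single-bit output, using only the multiplicative bound of eq. (2) and the bounded error condition; the case of w not in L is identical with outcome 0 in place of 1.

```latex
% Case w \in L: the simulated distribution still accepts with probability above 1/2.
\begin{aligned}
\mathrm{prob}[R_w = 1] \;&\geq\; \frac{1}{c}\,\mathrm{prob}[P_w = 1] \;>\; \frac{1-\epsilon}{c}, \\
\frac{1-\epsilon}{c} > \frac{1}{2} \;&\iff\; c < 2(1-\epsilon), \\
\epsilon' \;&=\; 1 - \frac{1-\epsilon}{c} \;<\; \frac{1}{2}.
\end{aligned}
```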

3 Main results 

3.1 The power of IQP with post-selection 

We begin by examining how the availability of post-selection is able to boost the computational 
power of various classes of circuits. For this it is convenient to introduce some further notions 
from complexity theory. If A and B are complexity classes, $A^B$ denotes the class A with an oracle
for B (cf [7, 8] for formal definitions). We may think of $A^B$ as the class of languages decided by
the computations subject to the restrictions and acceptance criteria of A but allowing an extra
new kind of computational step: we have an oracle or "subroutine" for any desired language L
in B that may be queried at any stage in the course of the computation, and each such query
counts as a single computational step i.e. bit strings may be generated as intermediate results
and presented to the oracle, which in a single step, returns the information of whether the bit
string is in L or not. The polynomial hierarchy class PH [7, 8] is defined to be the union of an
infinite tower of increasing classes $\Delta_k$, $k = 1, 2, \ldots$, in which $\Delta_1 = \mathrm{P}$ and $\Delta_{k+1} = \mathrm{P}^{\mathrm{N}\Delta_k}$. Here
$\mathrm{N}\Delta_k$ denotes the non-deterministic class associated to $\Delta_k$, in the same way that NP denotes the
non-deterministic class associated to P, i.e. we allow the process to branch at each step into two
separate computational paths and deem it to accept its input if and only if at least one path 
accepts. Further discussion and alternative characterisations of PH may be found in [7, 8]. 

For classical computation it is known [7, 8] that BPP is contained in $\mathrm{N}\Delta_2$ and also that
post-BPP is contained in $\Delta_3$ [16]. Now for any complexity class C we have $\mathrm{P}^{\mathrm{P}^C} = \mathrm{P}^C$ (since in
the first expression any query to a $\mathrm{P}^C$ oracle can be replaced by a polynomial time computation
with queries to the corresponding oracle for C). Hence we get

\[ \mathrm{P}^{\text{post-BPP}} \subseteq \mathrm{P}^{\Delta_3} = \Delta_3. \tag{3} \]

We will use this inclusion below in corollary 1. 

For the case of quantum computation it is not known whether BQP is contained within PH 
or not [10], but as mentioned above, Aaronson [6] has shown that post-BQP = PP. A theorem of
Toda [9, 7, 8] asserts that $\mathrm{PH} \subseteq \mathrm{P}^{\mathrm{PP}}$ so we get $\mathrm{P}^{\text{post-BQP}} \supseteq \mathrm{PH}$. On the other hand we had
$\mathrm{P}^{\text{post-BPP}} \subseteq \Delta_3$ so from an oracle perspective, the power of post-BPP is modest compared to
post-BQP or PH.

In view of the above considerations, and recalling that uniform families of IQP circuits 
are intuitively expected to be far weaker than general quantum computations (and even fail to 
include many computations in P, such as many elementary arithmetic operations that manifestly 
depend on the order of operations applied) our next result is perhaps unexpected. 

Theorem 1 post-IQP = post-BQP = PP. 

Proof. Clearly post-IQP $\subseteq$ post-BQP and we show the reverse inclusion. Consider an arbitrary
uniform quantum circuit family with inputs $|0\rangle \ldots |0\rangle$ and with gates drawn from the following
universal set: H, Z, CZ and $P = e^{i\frac{\pi}{8}Z}$. (For a later purpose we point out here that all these
gates are 1- or 2-qubit gates and apart from H, all gates have diagonal entries that are integer
powers of $e^{i\pi/8}$.) If we are allowed to post-select such circuit families then we obtain post-BQP
as the class of languages decidable with bounded error. Our strategy is to exhibit a direct
reduction from any such post-selected circuit family to a post-selected IQP circuit family whose 
output conditional probabilities are the same as those of the original family. 

Firstly we add in extra H gates to ensure that every line begins and ends with an H gate. 
This is possible since $H^2 = I$. Next consider in turn each intermediate H gate i.e. those that
do not begin or end a line. For each such gate $H_a$ acting on line a we include an extra qubit
line labelled e (for "extra"). Consider now the following "Hadamard gadget" (somewhat akin
to a gate teleportation) illustrated in figure 1. On lines a, e initialised to $|\psi\rangle_a |0\rangle_e$, where $|\psi\rangle$
is any state, we apply the process $|\psi\rangle_a |0\rangle_e \to H_a CZ_{ae} H_e |\psi\rangle_a |0\rangle_e$ followed by post-selection of
outcome 0 on line a. An easy calculation shows that the resulting state on line e is $H|\psi\rangle$. In the
original circuit we replace $H_a$ by the Hadamard gadget; here $|\psi\rangle$ represents the circuit's general
input state to $H_a$ and subsequently line e is used as the output line of $H_a$ for further gates in
the original circuit. Alternatively we may extend the gadget by a SWAP$_{ae}$ gate and use line
a as output. SWAP is not a valid IQP gate so to obtain the final circuit we commute out all
SWAP gates to the end of the lines.

In the resulting circuit, the new line e is initialised to $|0\rangle$ and begins and ends with an H
gate. Thus the non-diagonal intermediate H gate has been replaced by a new CZ gate and an
additional post-selection. Performing this replacement for every intermediate H gate results in
an IQP circuit with some extra post-selections on the new e lines, and with the same output
conditional probabilities as originally (now conditioned on the new extra post-selections too). □

Figure 1: The Hadamard gadget for removal of intermediate H gates. (a) $|\alpha\rangle$ represents a
general input state to a gate U within the circuit that is followed by an intermediate H gate.
(b) The lower line is a new ancillary qubit line. The original intermediate H gate may then
be replaced by a new CZ gate, a post-selection (denoted by $\langle 0|$) and two H gates that are now
both at the ends of lines, as allowed in IQP circuit architecture.
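
The "easy calculation" behind the Hadamard gadget can be checked numerically for a single-qubit input. The sketch below tracks the four joint amplitudes of lines a and e directly; the function names and the test state are illustrative assumptions, and the input is taken to be a pure single-qubit state for simplicity (by linearity the same holds when line a is entangled with the rest of the circuit).

```python
import math

S = 1 / math.sqrt(2)


def hadamard(v):
    """Apply H to a single-qubit amplitude pair (v0, v1)."""
    return (S * (v[0] + v[1]), S * (v[0] - v[1]))


def hadamard_gadget(psi):
    """Apply H_a CZ_ae H_e to |psi>_a |0>_e and post-select outcome 0 on line a.

    psi is the amplitude pair (alpha, beta) of line a. Returns the normalised
    state left on line e and the probability of the post-selected outcome.
    """
    alpha, beta = psi
    # amp[a][e] holds the joint amplitude of |a>_a |e>_e; line e starts in |0>.
    amp = [[alpha, 0j], [beta, 0j]]
    # H on line e.
    amp = [list(hadamard(row)) for row in amp]
    # CZ on lines a and e: sign flip on |1>_a |1>_e.
    amp[1][1] = -amp[1][1]
    # H on line a, applied separately for each value of e.
    cols = [hadamard((amp[0][e], amp[1][e])) for e in range(2)]
    amp = [[cols[0][a], cols[1][a]] for a in range(2)]
    # Post-select line a = 0 and renormalise.
    branch = amp[0]
    p_success = sum(abs(z) ** 2 for z in branch)
    norm = math.sqrt(p_success)
    return tuple(z / norm for z in branch), p_success


psi = (0.6, 0.8j)  # arbitrary normalised test state
state_e, p_success = hadamard_gadget(psi)
print(state_e, hadamard(psi), p_success)
```

For a normalised input the post-selected outcome occurs with probability 1/2, and the state left on line e equals H applied to the input with no extra phase.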

In the above construction, the post-BQP circuit that we start with may, without loss of
generality, be assumed to comprise only nearest-neighbour 2-qubit gates. Then the SWAP
operations introduced by the Hadamard gadgets will, at first sight, result in a post-IQP circuit
that is not truly nearest-neighbour. But by simply 'terminating some of the measurements 
early', and 'creating some ancillas late' we can avoid line crossings (as is evident from figure 
1(b)). The practical upshot of this is that the quantum part of the IQP process resulting from 
this construction can be rendered, logically speaking, by local interactions on a flat 2-dimensional 
surface (albeit still involving the inefficient resource of post-selection). 

3.2 Classical simulation of IQP circuits and collapse of PH 

Although IQP circuits have very simple ingredients, we now provide evidence (in corollary 1 be- 
low) that they nevertheless embody computational possibilities that are inaccessible to classical 
efficient (randomised) computation. 

Theorem 2 If the output probability distributions generated by uniform families of IQP circuits
could be weakly classically simulated to within multiplicative error $1 \leq c < \sqrt{2}$ then post-BPP =
PP.

Proof. We will show that under the stated simulation assumption, any language in post-IQP
is in post-BPP and then theorem 1 (together with post-BPP $\subseteq$ post-BQP) will give post-BPP
= PP.

Let $L \in$ post-IQP be any language decided with bounded error by a uniform family of
post-selected IQP circuits $C_w$ with (single line) output registers $\mathcal{O}_w$ and post-selection registers $\mathcal{P}_w$.
Introduce

\[ S_w(x) = \frac{\mathrm{prob}[\mathcal{O}_w = x \ \&\ \mathcal{P}_w = 0 \ldots 0]}{\mathrm{prob}[\mathcal{P}_w = 0 \ldots 0]} \tag{4} \]

so the bounded error condition states the following:

\[ \begin{array}{l} \text{if } w \in L \text{ then } S_w(1) > \frac{1}{2} + \delta, \\ \text{if } w \notin L \text{ then } S_w(1) < \frac{1}{2} - \delta \end{array} \tag{5} \]

for some $0 < \delta < \frac{1}{2}$. Furthermore post-IQP is independent of the level of error so for any $L \in$
post-IQP we may assume that eq. (5) holds for any choice of $0 < \delta < \frac{1}{2}$, however large. Now
let $\mathcal{Y}_w$ denote the full register of lines of $C_w$, comprising m lines say. If an output measurement
on all lines of $C_w$ can be weakly classically simulated to within multiplicative error c then there
is a uniform family of classical randomised circuits $\tilde{C}_w$ with output register $\tilde{\mathcal{Y}}_w$ comprising m
lines with

\[ \frac{1}{c}\, \mathrm{prob}[\mathcal{Y}_w = y_1 \ldots y_m] \leq \mathrm{prob}[\tilde{\mathcal{Y}}_w = y_1 \ldots y_m] \leq c\, \mathrm{prob}[\mathcal{Y}_w = y_1 \ldots y_m]. \tag{6} \]

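Similarly all marginal distributions for corresponding sub-registers of $\mathcal{Y}_w$ and $\tilde{\mathcal{Y}}_w$ satisfy the
same inequality. Let $\tilde{\mathcal{O}}_w$ and $\tilde{\mathcal{P}}_w$ denote the registers of $\tilde{C}_w$ corresponding to $\mathcal{O}_w$ and $\mathcal{P}_w$ of $C_w$,
and introduce

\[ \tilde{S}_w(x) = \frac{\mathrm{prob}[\tilde{\mathcal{O}}_w = x \ \&\ \tilde{\mathcal{P}}_w = 0 \ldots 0]}{\mathrm{prob}[\tilde{\mathcal{P}}_w = 0 \ldots 0]}. \tag{7} \]

Using the inequalities of eq. (6) (for the registers appearing in eq. (7)) we get

\[ \frac{1}{c^2}\, S_w(x) \leq \tilde{S}_w(x) \leq c^2\, S_w(x). \tag{8} \]

Combining this with eq. (5) we see that the classical uniform family $\tilde{C}_w$ (post-selected on $\tilde{\mathcal{P}}_w$)
will decide L with bounded error if $c^2 < 1 + 2\delta$. Since $\delta$ can be any value satisfying $\delta < \frac{1}{2}$ we
see that any value of $c < \sqrt{2}$ will suffice to guarantee that $L \in$ post-BPP. □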
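
For completeness, the step from eqs. (5) and (8) to the condition on c can be spelled out as follows. This is a sketch of the arithmetic only, writing $\tilde{S}_w$ for the quotient of eq. (7) formed from the classical simulation.

```latex
\begin{aligned}
w \in L:\quad & \tilde{S}_w(1) \;\geq\; \frac{1}{c^2}\, S_w(1) \;>\; \frac{1}{c^2}\left(\tfrac{1}{2} + \delta\right) \;>\; \tfrac{1}{2}
  \quad\text{if } c^2 < 1 + 2\delta, \\
w \notin L:\quad & \tilde{S}_w(1) \;\leq\; c^2\, S_w(1) \;<\; c^2\left(\tfrac{1}{2} - \delta\right) \;<\; \tfrac{1}{2}
  \quad\text{if } c^2 < \frac{1}{1 - 2\delta}, \\
& (1 + 2\delta)(1 - 2\delta) = 1 - 4\delta^2 < 1
  \;\Longrightarrow\; 1 + 2\delta < \frac{1}{1 - 2\delta}.
\end{aligned}
```

The first condition is therefore the binding one, and letting $\delta$ approach $\frac{1}{2}$ gives the threshold $c < \sqrt{2} \approx 1.414$, which is the 41% multiplicative error quoted in the abstract.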

It is interesting to point out that our use of a multiplicative approximation (cf eq. (6)) 
accords well with the quotient structure of the conditional probabilities in eqs. (4) and (7), 
allowing us to derive the bounding relationship eq. (8) between $S_w$ and $\tilde{S}_w$. In contrast, use
of an additive approximation or approximation to within $\epsilon$ total variation distance would be
problematic: the denominators of eqs. (4) and (7) are required only to be positive, so additive 
or total variation distance approximations would allow catastrophic divergences of the associated 
probability quotients. 

Corollary 1 If the output probability distributions generated by uniform families of IQP circuits
could be weakly classically simulated to within multiplicative error $1 \leq c < \sqrt{2}$ then the
polynomial hierarchy would collapse to its third level i.e. PH = $\Delta_3$.

Proof. Under the simulation assumption we may apply theorem 2, and Toda's theorem with
eq. (3) gives $\mathrm{PH} \subseteq \mathrm{P}^{\mathrm{PP}} = \mathrm{P}^{\text{post-BPP}} \subseteq \Delta_3$. □

From the proof of theorem 1 we see that it suffices in theorem 2 and corollary 1 to require 
the weak simulability condition only for a restricted kind of IQP circuit family, namely those 
comprising only 1- and 2-qubit gates with diagonal entries being only integer powers of $e^{i\pi/8}$. In
a similar vein one may ask whether the output register may be able to be restricted too, e.g. to
having size only O(log n). Recall that for the class post-IQP, although we have only single-line
output registers, the post-selection register may generally have size O(poly(n)) and in the proof
of theorem 2, the classical simulation needs to be applicable to IQP circuit families with output
registers of the latter size too (as they incorporate the original post-selection registers). Our 
next result shows that such restriction on the size of the output or post-selection register is not 
possible (on the assumption that PH does not collapse) i.e. we see that the computational power 
of post-selected IQP circuits (with a single line output register) depends crucially on the size of 
the post-selection register. 

Theorem 3 Let $P_w$ be the output probability distributions for any uniform family of IQP circuits
in which the output registers have size O(log n). Then $P_w$ may be sampled (without approximation)
by a classical randomised process that runs in time O(poly(n)).

Proof. Let $C_w$ be any uniform family of IQP circuits with output registers $\mathcal{O}_w$ of size $M =
O(\log n)$. Let $\mathcal{Y}_w$, of size $N$, denote the complementary register of all non-output lines and let
$x$ and $y$ denote generic bit strings of lengths $M$ and $N$ respectively. We view $C_w$ in its Z-basis
diagonal representation: on input $|0\rangle \ldots |0\rangle$ the initial Hadamard gates on all lines create an
equal superposition and after all Z-diagonal gates of the circuit (and just before the final round
of Hadamard gates) the state has the form

$$|\psi\rangle = \frac{1}{\sqrt{2^{M+N}}} \sum_{x,y} e^{i f(x,y)} |x,y\rangle. \qquad (9)$$

The phase function $f(x,y)$ can be computed in classical poly(n) time by accumulating the
relevant diagonal elements of the successive gates. Now the result of further gates and measurements
on $\mathcal{O}_w$ is independent of measurements on the disjoint register $\mathcal{Y}_w$. According to eq. (9)
a measurement of $\mathcal{Y}_w$ will yield a uniformly random bit string of length $N$. Thus to classically
simulate the output of the circuit we first classically choose a bit string $y_0$ uniformly at random
and consider the state

$$|\phi_{y_0}\rangle = \frac{1}{\sqrt{2^{M}}} \sum_{x} e^{i f(x,y_0)} |x\rangle.$$

Since $|\phi_{y_0}\rangle$ is a state of only $O(\log n)$ qubits (i.e. poly(n) dimensions) we can classically strongly
simulate the results of further gates and measurements on it, in poly(n) time by direct calculation,
giving overall an exact weak classical simulation of the original circuit family's output.
□
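
The sampling procedure in the proof above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's own construction: we assume (hypothetically) that the circuit is given as a list of terms `(theta, support)`, each standing for a Z-diagonal gate whose diagonal entry on a bit string is exp(i·theta·(-1)^p), with p the parity of the bits on `support`. All function names are ours.

```python
import cmath
import math
import random

def phase(terms, bits):
    """f(bits): sum of theta * (-1)^(parity of bits on the gate's support)."""
    total = 0.0
    for theta, support in terms:
        parity = sum(bits[k] for k in support) % 2
        total += theta * (1 - 2 * parity)
    return total

def conditional_output_probs(terms, n, out_lines, y0):
    """Outcome probabilities on the output lines, with the remaining lines
    fixed to the bit string y0 (listed in increasing line order)."""
    m = len(out_lines)
    rest = [k for k in range(n) if k not in out_lines]
    bits = [0] * n
    for k, b in zip(rest, y0):
        bits[k] = b
    # Amplitudes of |phi_{y0}> = 2^{-M/2} sum_x e^{i f(x, y0)} |x>.
    amps = []
    for x in range(1 << m):
        for j, k in enumerate(out_lines):
            bits[k] = (x >> j) & 1
        amps.append(cmath.exp(1j * phase(terms, bits)) / math.sqrt(1 << m))
    # Final Hadamards on the output lines (fast Walsh-Hadamard transform).
    h = 1
    while h < len(amps):
        for start in range(0, len(amps), 2 * h):
            for i in range(start, start + h):
                a, b = amps[i], amps[i + h]
                amps[i] = (a + b) / math.sqrt(2)
                amps[i + h] = (a - b) / math.sqrt(2)
        h *= 2
    return [abs(a) ** 2 for a in amps]

def sample_output(terms, n, out_lines, rng=random):
    """Draw one sample of the output register: pick y0 uniformly at random,
    then sample x from the exactly computed conditional distribution."""
    y0 = [rng.randrange(2) for _ in range(n - len(out_lines))]
    probs = conditional_output_probs(terms, n, out_lines, y0)
    x = rng.choices(range(len(probs)), weights=probs)[0]
    return [(x >> j) & 1 for j in range(len(out_lines))]
```

The state vector held in memory has 2^M entries, which is poly(n) when M = O(log n); the N non-output lines enter only through the single random string `y0`, so their number does not affect the size of the computation beyond evaluating the phase function.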

The methods developed in [11] may also be used to readily provide a weak classical simulation 
up to additive polynomial error for the families in the above theorem. 

Shepherd in [18] gives a series of further classical simulation properties of IQP circuits. In
particular it is shown there that the distributions $P_w$ in the above theorem are not only exactly
weakly simulable, but even more, they are classically strongly simulable, if all the gates are
restricted to have diagonal entries of only integer powers of $e^{i\pi/8}$ (which suffices, as we have
noted, to obtain the conclusions of theorems 1 and 2).

As introduced in [4], we may consider a more general notion of an IQP-assisted classical
computation than just the single use of the output of a uniform family of IQP circuits. Let
$\mathcal{IQP}$ denote an oracle which, if given a description C of an IQP circuit, will obligingly return (in
one computational step) a sample of C's output distribution. Then we may consider complexity
classes such as $\mathrm{BPP}^{\mathcal{IQP}}$, defined as the class of languages decided with bounded error by a
classical probabilistic polynomial-time computation where, in addition to the usual classical steps,
the computation may query the oracle with IQP circuit descriptions that have been produced
as intermediate results along the way. Since any IQP circuit is a particular kind of quantum
circuit, it is easy to see that $\mathrm{BPP}^{\mathcal{IQP}} \subseteq \mathrm{BQP}$, and theorem 3 shows that $\mathrm{BPP}^{\mathcal{IQP}[\log n]} = \mathrm{BPP}$,
where $\mathrm{BPP}^{\mathcal{IQP}[\log n]}$ denotes that the oracle is queried only with IQP circuits having at most
$O(\log n)$ output lines.

4 Some further remarks 

It is interesting to note that the methods used to prove our principal results in theorem 2 and 
corollary 1 may be applied to other classes of circuits. The only feature of IQP that we needed 
was the result of theorem 1, that post-selection boosts its power to PP. Thus the evidence of 
hardness of classical simulation provided by corollary 1 would apply to any class of circuits that 
similarly goes to PP under post-selection. For example, the constructions in [13, 14] (exploiting 
the notion of gate teleportation [19]) imply that the power of quantum circuits of depth 4 (i.e. 3 
layers of unitary gates followed by a layer of measurements) with post-selection includes BQP and 
hence also post-BQP = PP, while quantum circuits of depth 3 are known to be always strongly 
classically simulable. More formally, introducing as in [14] the class $\mathrm{BQNC}^0$ of languages decided
with bounded error by uniform families of constant depth circuits, we have post-$\mathrm{BQNC}^0$ = PP,
and the conclusion of our corollary 1 then applies with $\mathrm{QNC}^0$ (constant depth quantum circuits)
replacing IQP.